Large pre-trained models, such as BERT, GPT, and Wav2Vec, have demonstrated great potential for learning representations that transfer to a wide variety of downstream tasks. Obtaining a large quantity of supervised data is difficult because resources and time are limited. In light of this, a significant amount of research has focused on adapting large pre-trained models to diverse downstream tasks via fine-tuning, linear probing, or prompt tuning in low-resource settings. Normalization techniques are essential for accelerating training and improving the generalization of deep neural networks, and they have been used successfully in a wide variety of applications. Many normalization techniques have been proposed, but their success in low-resource downstream NLP and speech tasks is limited. One reason is the inability of the rescaling parameters of normalization to capture expressiveness. We propose Kullback-Leibler (KL) regularized normalization (KL-Norm), which makes the normalized data well behaved and improves generalization: it reduces overfitting, generalizes well to out-of-domain distributions, and removes irrelevant biases and features, with a negligible increase in model parameters and memory overhead. Detailed experimental evaluation on multiple low-resource NLP and speech tasks demonstrates the superior performance of KL-Norm compared to other popular normalization and regularization techniques.
Graph Neural Networks (GNNs) have revolutionized molecular discovery by uncovering patterns and identifying unknown features that can aid in predicting biophysical properties and protein-ligand interactions. However, current models typically rely on 2-dimensional molecular representations as input, and while the use of 2- and 3-dimensional structural data has gained deserved traction in recent years, many of these models are still limited to static graph representations. We propose a novel approach based on the transformer model that uses GNNs to characterize dynamic features of protein-ligand interactions. Our message-passing transformer pre-trains on a set of molecular dynamics data from physics-based simulations to learn coordinate construction, and makes binding probability and affinity predictions as a downstream task. In extensive testing against existing models, our MDA-PLI model outperformed the molecular interaction prediction models with an RMSE of 1.2958. The geometric encodings enabled by our transformer architecture and the addition of time series data add a new dimension to this form of research.
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
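The large-scale video data generated in today's smart cities has raised concerns about its purposeful use. Surveillance cameras, among other sources, are the most prominent contributors to these huge volumes of data, which makes automated analysis a difficult task in terms of both computation and precision. Violence detection (VD), which falls broadly within the action and activity recognition domain, is used to analyze large-scale video data for anomalous actions caused by humans. The VD literature has traditionally been based on manually engineered features, although it has advanced to standalone deep learning models developed for real-time VD analysis. This paper gives an overview of deep sequence learning approaches together with localization strategies for the detected violence. The overview also covers the early image processing and machine learning based VD literature and its possible advantages, such as efficiency relative to current complex models. Furthermore, the datasets are discussed to provide an analysis of current models, explaining their pros and cons, along with future directions in the VD domain derived from an in-depth analysis of previous methods.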
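COVID-19 has caused multiple waves of infection associated with different SARS-CoV-2 variants. Studies have reported that these variants affect patients' respiratory health differently. We explore whether acoustic signals collected from COVID-19 subjects show distinguishable acoustic patterns, which would suggest the possibility of predicting the underlying virus variant. We analyze the Coswara dataset, which was collected from three subject pools: i) healthy subjects, ii) COVID-19 subjects recorded during the period when the Delta variant was dominant, and iii) COVID-19 subjects recorded during the Omicron surge. Our findings suggest that multiple sound categories, such as cough, breathing, and speech, show significant differences in acoustic features when COVID-19 subjects with the Omicron and Delta variants are compared. The classification areas under the curve are significantly above chance for distinguishing subjects infected by Omicron from those infected by Delta. Using score fusion across multiple sound categories, we obtained an area under the curve of 89% and a sensitivity of 52.4% at 95% specificity. In addition, a hierarchical three-class approach was used to classify the acoustic data into healthy and COVID-19 positive, and then to divide the COVID-19 subjects into Delta and Omicron variants, providing a high level of three-class classification accuracy. These results suggest new ways of designing sound-based COVID-19 diagnosis approaches.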
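Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. To inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing on linguistics, childhood development, mathematics, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.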
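The COVID-19 pandemic has accelerated research on the design of alternative, quick, and effective COVID-19 diagnosis approaches. In this paper, we describe the Coswara tool, a web application designed to enable COVID-19 detection by analyzing respiratory sound samples and health symptoms. A user of this service can log into the website from any device connected to the internet, provide current health symptom information, and record a few sound samples corresponding to breathing, cough, and speech. Within a minute of analyzing this information, the website tool outputs a COVID-19 probability score to the user. As the COVID-19 pandemic continues to demand massive and scalable population-level testing, we hypothesize that the proposed tool provides a potential solution to this need.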
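Protein-ligand interactions (PLIs) are fundamental to biochemical research, and their identification is crucial for estimating the biophysical and biochemical properties needed for rational therapeutic design. Experimental characterization of these properties is currently the most accurate method, but it is very time-consuming and labor-intensive. A number of computational methods have been developed in this context, but most existing PLI prediction depends heavily on 2D protein sequence data. Here, we present a novel parallel graph neural network (GNN) that integrates knowledge representation and reasoning for PLI prediction, so that deep learning is guided by expert knowledge and informed by 3D structural data. We develop two distinct GNN architectures: GNNF is the base implementation, which employs distinct featurization to enhance domain awareness, while GNNP is a novel implementation that can predict with no prior knowledge of the intermolecular interactions. The comprehensive evaluation shows that the GNNs can successfully capture the binary interactions between ligands and protein 3D structures, with a reported test accuracy of 0.958 for predicting the activity of a protein-ligand complex. These models are further adapted for regression tasks to predict experimental binding affinity and pIC50, which are crucial for drug potency and efficacy. We achieve Pearson correlation coefficients of 0.66 and 0.65 on experimental affinity and 0.50 and 0.51 on pIC50 with GNNF and GNNP, respectively, outperforming 2D sequence-based models. Our method can serve as an interpretable and explainable artificial intelligence (AI) tool for predicting the activity, potency, and biophysical properties of lead candidates. To this end, we demonstrate the utility of GNNP on SARS-CoV-2 protein targets by screening a large compound library and comparing our predictions with experimentally measured data.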
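We outline emerging opportunities and challenges for enhancing the utility of AI for scientific discovery. The distinct goals of AI for industry and of AI for science create a tension between recognizing patterns in data and discovering patterns from data. If we address the fundamental challenges associated with "bridging the gap" between domain-driven scientific models and data-driven AI learning machines, then we expect that these AI models can transform hypothesis generation, scientific discovery, and the scientific process itself.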
Quadruped robots are currently used in industrial robotics as a mechanical aid to automate several routine tasks. However, the use of such a robot in a domestic setting is still very much a research topic. This paper discusses the design and virtual simulation of such a robot, capable of detecting and understanding human emotions, generating its gait, and responding via sounds and expressions on a screen. To this end, we use a combination of reinforcement learning and software engineering concepts to simulate a quadruped robot that can understand emotions, navigate various terrains, detect sound sources, and respond to emotions using audio-visual feedback. This paper aims to establish a framework for simulating a quadruped robot that is emotionally intelligent and can respond to audio-visual stimuli primarily through motor or audio responses. Speech emotion detection was not as performant as ERANNs or Zeta Policy learning, but still achieved an accuracy of 63.5%. The video emotion detection system produced results almost on par with the state of the art, with an accuracy of 99.66%. Owing to its "on-policy" learning process, the PPO algorithm learned extremely rapidly, allowing the simulated dog to demonstrate a remarkably seamless gait across different cadences and variations. This enabled the quadruped robot to respond to generated stimuli, allowing us to conclude that it functions as predicted and satisfies the aim of this work.